Azure Databricks test cases and Git commit

Anshal 2,006 Reputation points
2024-05-14T09:05:12.2566667+00:00

Hi friends, we are testing our code in a testing environment. There are many test cases, and we want to run them per use case before committing to Git. Since there are too many of them, I do not want to run them manually for each use case; I want an automated process that runs before the commit to Prod. What is the best strategy in this scenario, and how should we approach it correctly?

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

Accepted answer
  1. ShaikMaheer-MSFT 38,206 Reputation points Microsoft Employee
    2024-05-15T04:58:04.6166667+00:00

    Hi Anshal,
    Thank you for posting your query on the Microsoft Q&A platform.

    To automate testing before committing to Git, you can run automated tests for your notebooks and jobs on Azure Databricks with a standard testing framework such as pytest or unittest. Here's a high-level approach you can follow:

    1. Write your test cases in Databricks notebooks using the testing framework. Organize them into test suites that can be run independently or together.
    2. Set up a Databricks job to run your test suites automatically. You can schedule the job to run at a specific time or trigger it on demand.
    3. Configure the job to fail if any of the test suites fail, so that failing tests are reported as a failed job run.
    4. Set up a pre-commit hook in your local Git repository that runs the Databricks job and rejects the commit if the job fails. This ensures that all tests pass before code reaches the production branch.
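
    As a rough illustration of the last three steps above, here is a minimal Python sketch, not a definitive implementation. It assumes the tests already run as a Databricks job, calls the Jobs REST API 2.1 (`run-now`, then `runs/get`) and reports whether the run succeeded. The environment variable `TEST_JOB_ID` and the helper names `call_api`, `run_test_job` and `main` are made up for this example; `DATABRICKS_HOST` and `DATABRICKS_TOKEN` are assumed to hold your workspace URL and a personal access token.

```python
#!/usr/bin/env python3
"""Sketch: run a Databricks test job and report success or failure."""
import json
import os
import sys
import time
import urllib.request

# Life-cycle states after which a run will not change any more.
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def call_api(host, token, method, path, payload=None):
    """Send one authenticated request to the Databricks REST API."""
    data = json.dumps(payload).encode() if payload is not None else None
    request = urllib.request.Request(
        f"{host}{path}",
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_test_job(job_id, api, poll_seconds=15):
    """Trigger the job, wait until it finishes, return True on success."""
    run_id = api("POST", "/api/2.1/jobs/run-now", {"job_id": job_id})["run_id"]
    while True:
        state = api("GET", f"/api/2.1/jobs/runs/get?run_id={run_id}")["state"]
        if state["life_cycle_state"] in TERMINAL_STATES:
            # result_state is only SUCCESS when every task of the run passed.
            return state.get("result_state") == "SUCCESS"
        time.sleep(poll_seconds)


def main():
    """Read the settings from the environment and return an exit code."""
    host = os.environ["DATABRICKS_HOST"].rstrip("/")
    token = os.environ["DATABRICKS_TOKEN"]
    job_id = int(os.environ["TEST_JOB_ID"])  # hypothetical variable name

    def api(method, path, payload=None):
        return call_api(host, token, method, path, payload)

    if run_test_job(job_id, api):
        return 0
    print("Databricks test job failed; commit aborted.", file=sys.stderr)
    return 1
```

    To use it as a hook, one option is to save it as `.git/hooks/pre-commit`, make it executable, and end the file with `sys.exit(main())`; Git aborts the commit when the hook exits with a non-zero status. Since the hook then waits for the whole job on every commit, the same script may be a better fit for a CI pipeline step.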

    By following this approach, you can ensure that your code is thoroughly tested before it is deployed to production, and you can catch any issues early in the development process.

    Hope this helps. Please let me know if you have any further queries.


    Please consider clicking the Accept Answer button. Accepted answers help the community as well.


1 additional answer

  1. Amira Bedhiafi 16,306 Reputation points
    2024-05-14T14:13:58.7033333+00:00

    I think you need to set up a strategy based on 4 main points:

    1. Categorize your test cases based on their functionality and use cases. Try dividing them into unit tests, integration tests, and end-to-end tests.
    2. Use a testing framework that supports automation in Databricks (like pytest or unittest for Python).
    3. Integrate your testing process into a CI/CD pipeline.
    4. Use Git hooks to ensure tests run before code is committed.

    More links:

    https://learn.microsoft.com/en-us/azure/databricks/repos/

    https://docs.databricks.com/en/repos/index.html

    1 person found this answer helpful.